We employ the dependently-typed programming language Agda2 to explore formalisation of untyped and
typed term graphs directly as set-based graph structures, via the gs-monoidal categories of Corradini and 
Gadducci, and as nested let-expressions using Pouillard and Pottier's NotSoFresh library of variable-binding 
abstractions. 

1 Introduction 

The Coconut project [AK09a, AK09b] uses "code graphs" [KAC06], a variant of term graphs in the spirit
of "jungles" [HP91, CR93], as intermediate presentation for the generation of highly optimised assembly
code. This is currently implemented in Haskell, and we use the Haskell type system in an embedded
domain-specific language (EDSL) for creating such code graphs via what appears to be standard Haskell
function definitions, with let-definitions introducing sharing, and with functions representing
assembly-level operations constructing hyperedges [AK09a]. However, since Haskell does not support full
dependent typing, the intermediate term graph datatype interface, supporting graph navigation, traversal,
and manipulation operations, cannot preserve the connection with the Haskell-level typing of the assembly
operations. Therefore, although EDSL-created code graphs are well-typed by construction, as certified
by the type checker, this no longer holds for code graphs that are the result of internal operations.
Those internal operations either require separate proof that they preserve well-typedness, or they need to
perform run-time checks, at considerable run-time cost.

In addition, our code-graph-creation EDSL has a second "simulator" implementation, which turns 
the EDSL expressions into Haskell functions that implement a "machine simulation". Since the code 
graph representation has lost its connection with the Haskell-level typing, it is "unintuitively hard" to use 
the simulation machinery for code graphs that result from code graph manipulation operations. 

Mainly for these reasons, we are now exploring implementation of code graphs in a dependently
typed programming language, where there is no need to "lose" the type information when moving to a
graph representation, and where even stronger assertions about operations on code graphs than just type
preservation can be proven inside the implementing system.

We start, in Sect. 2, with a quick introduction to the dependently typed programming language (and
proof checker) Agda [Nor07]. This is followed by formalisations of set-based mathematical definitions
of untyped (Sect. 3) and typed (Sect. 4) term graphs, and then a summary of the gs-monoidal category
view on these term graphs in Sect. 5. Finally, we present two formalisations of acyclic term graphs as
(differently structured) nested let-expressions (Sections 6 and 7).

2 Introduction to Agda: Types, Sets, Equality 

The Agda home page¹ states:

¹ http://wiki.portal.chalmers.se/agda/
Agda is a dependently typed functional programming language. It has inductive families,
i.e., data types which depend on values, such as the type of vectors of a given length. It also has 
parametrised modules, mixfix operators, Unicode characters, and an interactive Emacs interface 
which can assist the programmer in writing the program. 

Agda is a proof assistant. It is an interactive system for writing and checking proofs. Agda is 
based on intuitionistic type theory, a foundational system for constructive mathematics developed 
by the Swedish logician Per Martin-Löf. It has many similarities with other proof assistants based
on dependent types, such as Coq, Epigram, Matita and NuPRL. 

Syntactically and "culturally", Agda is quite close to Haskell. However, since Agda is strongly
normalising and has no ⊥ values, the underlying semantics is quite different. Also, since Agda is dependently
typed, it does not have the distinction that Haskell has between terms, types, and kinds (the "types of the
types"). The Agda constant Set corresponds to the Haskell kind *; it is the type of all "normal" datatypes.
For example, the Agda standard library defines the type Bool as follows:

data Bool : Set where
  true  : Bool
  false : Bool

Since Set needs again a type, there is Set₁, with Set : Set₁, etc., resulting in a hierarchy of "universes".
Since version 2.2.8, Agda supports universe polymorphism, with universes Set i where i is an element of
the following special-purpose variant of the natural numbers:

data Level : Set where
  zero : Level
  suc  : (i : Level) → Level

With this, the conventional usage turns into syntactic sugar, so that Set is now Set zero, and Set₁ =
Set (suc zero). For example, the standard library includes the following universe-polymorphic definition
for the parameterised Maybe type:

data Maybe {a : Level} (A : Set a) : Set a where
  just    : (x : A) → Maybe A
  nothing : Maybe A

Maybe has two parameters, a and A, where dependent typing is used since the type of the second param- 
eter depends on the first parameter. The use of { . . . } flags a as an implicit parameter that can be elided 
where its type is implied by the call site of Maybe. This happens in the occurrences of Maybe A in the 
types of the data constructors just and nothing: In Maybe A, the value of the first, implicit parameter of 
Maybe can only be a, the level of the set A. 

The same applies to implicit function arguments, and in most cases, implicit arguments or parameters
are determined by later arguments or parameters, respectively. Frequently, implicit arguments correspond
quite precisely to that part of the context of mathematical statements that is frequently left implicit by 
mathematicians, so that the reader may be advised to skip implicit arguments at first reading of a type, and 
return to them for clarification where necessary for understanding the types of the explicit parameters. 

While the Hindley-Milner typing of Haskell and ML allows function definitions without declaration 
of the function type, and type signatures without declaration of the universally quantified type variables, 
in Agda, almost all types and variables need to be declared, but implicit parameters and the type checking 
machinery used to resolve them alleviate that burden significantly. For example, the original definition 
writes only Maybe {a} (A : Set a) : Set a, since the type of a will be inferred from a's use as argument 
to Set. 
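To make the role of implicit arguments concrete, the following minimal sketch re-declares Maybe locally and adds a helper fromMaybe. The helper, the local Bool, and the two example definitions are illustrative assumptions and not part of the standard library excerpt quoted above; only the built-in module Agda.Primitive is imported.

```agda
open import Agda.Primitive using (Level)

-- Universe-polymorphic Maybe as in the text; the level a is implicit.
data Maybe {a : Level} (A : Set a) : Set a where
  just    : (x : A) → Maybe A
  nothing : Maybe A

-- Local Bool, so that the sketch does not depend on the standard library.
data Bool : Set where
  true  : Bool
  false : Bool

-- Hypothetical helper: both a and A are implicit and are determined
-- by the later, explicit arguments.
fromMaybe : {a : Level} {A : Set a} → A → Maybe A → A
fromMaybe d (just x) = x
fromMaybe d nothing  = d

-- Implicit arguments elided: Agda infers A = Bool and the lowest level.
example₁ : Bool
example₁ = fromMaybe false (just true)

-- The same function with the implicit argument A supplied by name.
example₂ : Bool
example₂ = fromMaybe {A = Bool} false nothing
```

In example₁ both a and A are left to the type checker; example₂ uses the {A = ...} syntax to supply one implicit argument explicitly, which is occasionally needed when later arguments do not determine it.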
The "programming types" like Maybe can be freely mixed with "formula types", inspired by the
Curry-Howard correspondence of "formulae as types, proofs as terms". The formula types of true
formulae contain their proofs, while the formula types of false formulae are empty.

The standard library type of propositional equality has (besides two implicit parameters) one explicit
parameter and one explicit argument; the definition therefore gives rise to types like the type "2 ≡ 1 + 1",
which can be shown to be inhabited using the definition of natural numbers 1 and 2 and natural number
addition +, and the type "2 ≡ 3", which is an empty type, since it has no proof.

data _≡_ {a : Level} {A : Set a} (x : A) : A → Set a where
  refl : x ≡ x

The underscore characters occurring in the name _≡_ declare mixfix syntax with argument positions for
explicit parameters and arguments; this mixfix syntax is already used in the type of the single constructor.
The definition introduces types x ≡ y for any x and y of type A, but only the types x ≡ x are inhabited,
and they contain the single element refl {a} {A} {x}.
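The two example types can be checked directly. The sketch below is self-contained and deliberately simplified: it uses a level-monomorphic equality and local definitions of the natural numbers, addition, and the empty type ⊥, all of which are assumptions made for illustration rather than the standard library versions.

```agda
data ℕ : Set where
  zero : ℕ
  suc  : ℕ → ℕ

infixl 6 _+_
infix  4 _≡_

_+_ : ℕ → ℕ → ℕ
zero  + n = n
suc m + n = suc (m + n)

-- Simplified propositional equality (no universe polymorphism).
data _≡_ {A : Set} (x : A) : A → Set where
  refl : x ≡ x

-- The empty type: no constructors, hence no elements.
data ⊥ : Set where

one : ℕ
one = suc zero

two : ℕ
two = suc one

three : ℕ
three = suc two

-- Inhabited: both sides normalise to suc (suc zero), so refl is accepted.
two≡one+one : two ≡ one + one
two≡one+one = refl

-- Empty: the indices cannot be unified, so the absurd pattern () suffices.
two≢three : two ≡ three → ⊥
two≢three ()
```

The second definition expresses "2 ≡ 3 has no proof" as a function into the empty type, which is the usual type-theoretic reading of negation.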

In Agda, as in other type theories without quotient types, sets with equality are typically modelled as
setoids, that is, carrier types equipped with an equivalence. This closely corresponds to the non-primitive
nature of the "equality" test (==) : Eq a => a -> a -> Bool in Haskell. A setoid is a dependent record
consisting of a Carrier set, a relation _≈_ on that carrier, and a proof that the relation _≈_ is an
equivalence relation:

record Setoid c ℓ : Set (suc (c ⊔ ℓ)) where
  field Carrier       : Set c
        _≈_           : Rel Carrier ℓ
        isEquivalence : IsEquivalence _≈_
  open IsEquivalence isEquivalence public

An Agda record is also a module that may contain other material besides its fields; the "open" clause 
makes the fields of the equivalence proof available as if they were fields of Setoid. This language feature 
enables incremental extension of smaller theories to larger theories at very low notational cost. 

Whenever we allow arbitrary node or edge sets, and we want to prove, for example, isomorphism of 
certain graphs, we actually need setoids and not just sets. For such contexts, we introduce the following 
abbreviation for extracting the carrier set from a setoid: 

⌊_⌋ : {c ℓ : Level} → Setoid c ℓ → Set c
⌊ s ⌋ = Setoid.Carrier s
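As a minimal sketch of how a setoid and the carrier abbreviation are used, the following block defines a simplified Setoid whose three equivalence laws are plain fields (the standard library bundles them in IsEquivalence instead) and a hypothetical instance trivialBool in which all Booleans are identified. Apart from the carrier abbreviation, every name here is an assumption made for illustration.

```agda
open import Agda.Primitive using (Level; lzero; lsuc; _⊔_)

-- Simplified setoid: laws are listed as direct fields.
record Setoid (c ℓ : Level) : Set (lsuc (c ⊔ ℓ)) where
  field
    Carrier : Set c
    _≈_     : Carrier → Carrier → Set ℓ
    refl    : {x : Carrier} → x ≈ x
    sym     : {x y : Carrier} → x ≈ y → y ≈ x
    trans   : {x y z : Carrier} → x ≈ y → y ≈ z → x ≈ z

-- Carrier extraction, as in the text.
⌊_⌋ : {c ℓ : Level} → Setoid c ℓ → Set c
⌊ s ⌋ = Setoid.Carrier s

data ⊤ : Set where
  tt : ⊤

data Bool : Set where
  true  : Bool
  false : Bool

-- Hypothetical instance: Bool with the total relation as equivalence.
trivialBool : Setoid lzero lzero
trivialBool = record
  { Carrier = Bool
  ; _≈_     = λ _ _ → ⊤
  ; refl    = tt
  ; sym     = λ _ → tt
  ; trans   = λ _ _ → tt
  }

-- ⌊ trivialBool ⌋ reduces to Bool, so true is one of its elements.
elem : ⌊ trivialBool ⌋
elem = true
```

The setoid itself is not a type, which is why the brackets are needed wherever an element of a node or edge setoid is to be mentioned in a type.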



3 Set-Based Term Graphs 

We now present a simple definition of term graphs that is intentionally kept close to conventional
mathematical formulations. To reduce complexity and improve readability of this initial formalisation, we
present untyped term graphs here; a typed variant will be shown in Sect. 4.

In the context of an arity-indexed label type Label : ℕ → Set, we first define a type DHG₁ of directed
hypergraphs with one output per edge, indexed by input and output arities of the whole graph, with the
following components (since Agda records are also modules, they can contain additional material besides
their fields):
• A setoid Inner of non-input nodes. (For simplicity, we do not employ universe polymorphism here, and
all our setoids are of type Setoid zero zero.)

For technical reasons, we find it more convenient to have the non-input nodes separate from the input 
nodes. Otherwise we would have had to include an explicit injection from the input positions to the 
complete node set. 

• The setoid Node of all nodes is then derived as the disjoint union of Inner with the setoid of input 
positions, which is obtained from Fin m, the set of natural numbers smaller than m. 

• The second field is the n-element vector of output nodes, which can be either input nodes or inner 
nodes. 

• For symmetry, we also provide the m-element vector of input nodes, constructed using allFin m which 
is the vector (i.e., array) containing all m elements of the set Fin m in sequence, i.e., 0, 1, . . . , m - 1. 

• Edge is the setoid of hyperedges. 

• eInfo maps each edge to a dependent tuple consisting of an arity k, a k-ary label, and a k-element vector
of edge input nodes.

• eOut maps each edge to its output node, which cannot be an input node of the Jungle, and therefore 
has to be an Inner node. (The function arrow between setoids is optically not distinguishable from 
the general function type arrow, but is technically a different symbol. Since setoids cannot be used as 
types, no confusion can arise.) 

• We derive the function eLabel that maps each edge e to its edge label. Since the arity of that label is 
not known in advance, the function eLabel returns a dependent pair consisting of the label arity k and 
a k-ary label. 

• We also derive the function eIn that maps each edge e to the vector of input nodes of e; the type of this
vector depends on the arity of e, which is the first component (proj₁) of the dependent tuple eLabel e.

record DHG₁ (m n : ℕ) : Set₁ where
  field Inner : Setoid zero zero
  Node = Fin.setoid m ⊎⊎ Inner
  field output : Vec ⌊ Node ⌋ n
  input : Vec ⌊ Node ⌋ m
  input = Vec.map inj₁ (allFin m)
  field Edge  : Setoid zero zero
        eInfo : ⌊ Edge ⌋ → Σ [k : ℕ] (Label k × Vec ⌊ Node ⌋ k)
        eOut  : Edge ⟶ Inner
  eLabel : ⌊ Edge ⌋ → Σ [k : ℕ] Label k
  eLabel e = Product.map id proj₁ (eInfo e)
  eIn : (e : ⌊ Edge ⌋) → Vec ⌊ Node ⌋ (proj₁ (eLabel e))
  eIn = proj₂ ∘ proj₂ ∘ eInfo

record Jungle (m n : ℕ) : Set₁ where
  field Inner : Setoid zero zero
  Node = Fin.setoid m ⊎⊎ Inner
  field output : Vec ⌊ Node ⌋ n
  input : Vec ⌊ Node ⌋ m
  input = Vec.map inj₁ (allFin m)
  field Edge  : Setoid zero zero
        eInfo : ⌊ Edge ⌋ → Σ [k : ℕ] (Label k × Vec ⌊ Node ⌋ k)
        EOut  : Inverse Edge Inner
  eOut : Edge ⟶ Inner
  eOut = Inverse.to EOut
  producer : Inner ⟶ Edge
  producer = Inverse.from EOut
  eLabel : ⌊ Edge ⌋ → Σ [k : ℕ] Label k
  eLabel e = Product.map id proj₁ (eInfo e)
  eIn : (e : ⌊ Edge ⌋) → Vec ⌊ Node ⌋ (proj₁ (eLabel e))
  eIn = proj₂ ∘ proj₂ ∘ eInfo

In the DHG₁ definition, eOut does not have to be surjective, which means that there may be "undefined
nodes", and eOut also does not have to be injective, which means that there may be "join nodes" in the
sense of [KAC06]. If bijectivity of eOut is desired, we can replace the setoid mapping with an inverse
pair of mappings, and extract eOut and the producer mapping for inner nodes from that, as shown in the
Jungle record above.
These jungles are isomorphic to conventional term graphs, where inputs (as arguments) and labels are
attached directly to inner nodes:

record TermGraph (m n : ℕ) : Set₁ where
  field Inner : Setoid zero zero
  Node = Fin.setoid m ⊎⊎ Inner
  field output : Vec ⌊ Node ⌋ n
  input : Vec ⌊ Node ⌋ m
  input = Vec.map inj₁ (allFin m)
  field label : ⌊ Inner ⌋ → Σ [k : ℕ] Label k
        args  : (n : ⌊ Inner ⌋) → Vec ⌊ Node ⌋ (proj₁ (label n))

The following basic constructor functions are highly similar for DHG₁, Jungle, and TermGraph; we show
them here for Jungle.

Using the one-element setoid ⊤ (with element tt), we can define primitive jungles consisting of a
single hyperedge:

prim : {k : ℕ} → Label k → Jungle k 1
prim {k} f = record
  { Inner  = ⊤
  ; output = [ inj₂ tt ]
  ; Edge   = ⊤
  ; eInfo  = λ _ → (k , (f , Vec.map inj₁ (allFin k)))
  ; EOut   = Inverse.id
  }

For wiring graphs, we need empty sets (⊥) of edges and inner nodes:

wire : {m n : ℕ} → Vec (Fin m) n → Jungle m n
wire {m} {n} v = record
  { Inner  = ⊥
  ; output = Vec.map inj₁ v
  ; Edge   = ⊥
  ; eInfo  = E.⊥-elim
  ; EOut   = Inverse.id
  }

With this, we can easily construct the standard wiring graphs required for defining a gs-monoidal
category (see Sect. 5) of Jungles:

idJungle : {m : ℕ} → Jungle m m
idJungle = wire (allFin _)

dupJungle : {m : ℕ} → Jungle m (m + m)
dupJungle {m} = wire (allFin m ++ allFin m)

termJungle : {m : ℕ} → Jungle m 0
termJungle = wire []

exchJungle : (m n : ℕ) → Jungle (m + n) (n + m)
exchJungle m n = wire (Vec.map (raise m) (allFin n) ++ Vec.map (inject+ n) (allFin m))

Separating the inner nodes from the inputs in particular has the advantage that for sequential composition, 
we can just use the disjoint union of the two Inner node sets: 
seqJungle : {k m n : ℕ} → Jungle k m → Jungle m n → Jungle k n
seqJungle {k} {m} {n} g₁ g₂ = let
    open Jungle
    h₁ : ⌊ Node g₁ ⌋ → Fin k ⊎ (⌊ Inner g₁ ⌋ ⊎ ⌊ Inner g₂ ⌋)
    h₁ = Sum.map id inj₁
    h₂ : ⌊ Node g₂ ⌋ → Fin k ⊎ (⌊ Inner g₁ ⌋ ⊎ ⌊ Inner g₂ ⌋)
    h₂ = [ (λ i → h₁ (Vec.lookup i (output g₁))) , inj₂ ∘ inj₂ ]′
  in record
    { Inner  = Inner g₁ ⊎⊎ Inner g₂
    ; output = Vec.map h₂ (output g₂)
    ; Edge   = Edge g₁ ⊎⊎ Edge g₂
    ; eInfo  = [ productMap₂₂ (Vec.map h₁) ∘ eInfo g₁ , productMap₂₂ (Vec.map h₂) ∘ eInfo g₂ ]′
    ; EOut   = EOut g₁ ⊕⊕ EOut g₂
    }

Parallel composition works similarly; here the input positions need to be adapted. 

parJungle : {m₁ n₁ m₂ n₂ : ℕ} → Jungle m₁ n₁ → Jungle m₂ n₂ → Jungle (m₁ + m₂) (n₁ + n₂)
parJungle {m₁} {n₁} {m₂} {n₂} g₁ g₂ = let
    open Jungle
    h₁ : ⌊ Node g₁ ⌋ → Fin (m₁ + m₂) ⊎ (⌊ Inner g₁ ⌋ ⊎ ⌊ Inner g₂ ⌋)
    h₁ = Sum.map (inject+ m₂) inj₁
    h₂ : ⌊ Node g₂ ⌋ → Fin (m₁ + m₂) ⊎ (⌊ Inner g₁ ⌋ ⊎ ⌊ Inner g₂ ⌋)
    h₂ = Sum.map (raise m₁) inj₂
  in record
    { Inner  = Inner g₁ ⊎⊎ Inner g₂
    ; output = Vec.map h₁ (output g₁) ++ Vec.map h₂ (output g₂)
    ; Edge   = Edge g₁ ⊎⊎ Edge g₂
    ; eInfo  = [ productMap₂₂ (Vec.map h₁) ∘ eInfo g₁ , productMap₂₂ (Vec.map h₂) ∘ eInfo g₂ ]′
    ; EOut   = EOut g₁ ⊕⊕ EOut g₂
    }



4 Typed Code Graphs 

Coconut code graphs [KAC06] have types associated with nodes, and hyperedges may have not only 
multiple inputs, but also multiple outputs, to be able to model operations that yield multiple results; the 
typing of the input and output nodes needs to be compatible with the operations indicated by the edge 
labels. 

For simplicity, we assume here a global set Type : Set of node types, and dispense with using
setoids in this section. An edge label is now indexed by vectors of input and output types, so we assume
Label : {m n : ℕ} → Vec Type m → Vec Type n → Set, and also define the dependent record type
EdgeType for collecting these indices:

record EdgeType : Set where
  field inArity  : ℕ
        outArity : ℕ
        inTypes  : Vec Type inArity
        outTypes : Vec Type outArity
An edge label then is such an index collection together with a label drawn from the corresponding label
set; the open declaration makes the EdgeType fields available for EdgeLabel elements as if this was a
record extension:

record EdgeLabel : Set where
  field eType : EdgeType
        label : Label (EdgeType.inTypes eType) (EdgeType.outTypes eType)
  open EdgeType eType public

For typed term graphs, there are many different ways to deal with node typing, and for any given way,
different views are useful in different contexts. We will keep a node typing function as a field, and derive
from this an indexed view of typed nodes, using the following general construct: Given a set A and a
typing function type for A, the Type-indexed set Typed A type associates with every type ty all elements
of A that have type ty; formally, an element of Typed A type ty is a dependent pair consisting of an
element a : A together with a proof that type a ≡ ty:

Typed : (A : Set) → (A → Type) → Type → Set
Typed A type ty = Σ [a : A] (type a ≡ ty)

Since the Agda standard library does not provide a variant of Vec where the element types may depend
on their positions, we directly use dependently typed functions starting from these positions instead,
producing "typed vectors" with elements typed according to the argument type vector v:

TypedVec : (A : Set) → (A → Type) → {k : ℕ} → Vec Type k → Set
TypedVec A type {k} v = (i : Fin k) → Typed A type (Vec.lookup i v)
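The following self-contained sketch instantiates Typed and TypedVec for a small example. It relies only on Agda's built-in modules; the local Fin, Vec, and lookup definitions, the two-element Type, and the nodes n₀ and n₁ are assumptions chosen for illustration, and Σ is written in its plain applied form rather than with binder syntax.

```agda
open import Agda.Builtin.Nat      using (Nat; zero; suc)
open import Agda.Builtin.Sigma    using (Σ; _,_)
open import Agda.Builtin.Equality using (_≡_; refl)

data Fin : Nat → Set where
  fzero : {n : Nat} → Fin (suc n)
  fsuc  : {n : Nat} → Fin n → Fin (suc n)

infixr 5 _∷_

data Vec (A : Set) : Nat → Set where
  []  : Vec A zero
  _∷_ : {n : Nat} → A → Vec A n → Vec A (suc n)

lookup : {A : Set} {n : Nat} → Fin n → Vec A n → A
lookup fzero    (x ∷ _)  = x
lookup (fsuc i) (_ ∷ xs) = lookup i xs

-- Hypothetical set of node types.
data Type : Set where
  int   : Type
  float : Type

Typed : (A : Set) → (A → Type) → Type → Set
Typed A type ty = Σ A (λ a → type a ≡ ty)

TypedVec : (A : Set) → (A → Type) → {k : Nat} → Vec Type k → Set
TypedVec A type {k} v = (i : Fin k) → Typed A type (lookup i v)

-- Hypothetical nodes: n₀ carries an int, n₁ a float.
data Node : Set where
  n₀ : Node
  n₁ : Node

nType : Node → Type
nType n₀ = int
nType n₁ = float

-- A single typed node; refl is accepted because nType n₁ reduces to float.
aFloat : Typed Node nType float
aFloat = n₁ , refl

-- A typed node vector for the type vector int ∷ float ∷ [].
nodes : TypedVec Node nType (int ∷ float ∷ [])
nodes fzero           = n₀ , refl
nodes (fsuc fzero)    = n₁ , refl
nodes (fsuc (fsuc ()))
```

Swapping n₀ and n₁ in nodes is rejected by the type checker, since refl would then have to prove float ≡ int; this is the mechanism by which typed node vectors keep edge interfaces consistent.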

The EdgeInfo associated with each hyperedge then contains, besides an EdgeLabel, two such "typed
node vectors", typed according to the label's typing information (for modularity, this definition is kept
outside the code graph definition and parameterised with the type Nodes for "typed node vectors" to be
supplied there):

record EdgeInfo (Nodes : {k : ℕ} → Vec Type k → Set) : Set where
  field eLab    : EdgeLabel
        eInput  : Nodes (EdgeLabel.inTypes eLab)
        eOutput : Nodes (EdgeLabel.outTypes eLab)
  open EdgeLabel eLab public

A CodeGraph is now defined roughly analogously to a Jungle, with the following differences worth
pointing out:

• Code graphs can be considered as "generalised hyperedges", and therefore have an EdgeType derived 
from the CodeGraph type parameters. Keeping the current parameters eases the implementation of the 
categorical view, in comparison with using the EdgeType as a parameter instead. 

• We only need to explicitly represent the typing of the inner nodes; from this we can derive the typing 
of all Nodes by looking up the typing of the input positions in inTypes. 

• A TypedNode ty is a Node with type ty; an element of TypedNodes v is a "typed node vector"
according to the type vector v.

• The CodeGraph field output and each individual edge interface use TypedNode "vectors".

• We can still provide lower-level interfaces to edges; we show functions that extract the edge label, edge
input arity, and edge input Node vectors (discarding the type information), both dependently-typed and
existentially-typed with respect to the vector length. (The corresponding functions eOut etc. are not
shown.)
record CodeGraph {m n : ℕ} (inTypes : Vec Type m) (outTypes : Vec Type n) : Set₁ where
  cgType : EdgeType
  cgType = record { inArity  = m
                  ; outArity = n
                  ; inTypes  = inTypes
                  ; outTypes = outTypes }

  field Inner : Set
        iType : Inner → Type

  Node = Fin m ⊎ Inner

  nType : Node → Type
  nType = [ (λ i → Vec.lookup i inTypes) , iType ]′

  TypedNode : Type → Set
  TypedNode = Typed Node nType

  TypedNodes : {k : ℕ} → Vec Type k → Set
  TypedNodes = TypedVec Node nType

  field output : TypedNodes outTypes

  input : TypedNodes inTypes
  input = λ i → (inj₁ i , refl)

  field Edge  : Set
        eInfo : Edge → EdgeInfo TypedNodes

  eLabel : Edge → EdgeLabel
  eLabel = EdgeInfo.eLab ∘ eInfo

  eInArity : Edge → ℕ
  eInArity = EdgeInfo.inArity ∘ eInfo

  eIn : (e : Edge) → Vec Node (eInArity e)
  eIn e = mkVec (proj₁ ∘ EdgeInfo.eInput (eInfo e))

  eIn′ : Edge → Σ [k : ℕ] (Vec Node k)
  eIn′ e = eInArity e , eIn e

Again, eOut is not guaranteed to reach all nodes, and, due to the possibility of multi-output operations, 
this cannot be amended by joining the Inner and Edge sets as in jungles. This and other degrees of 
generality contained in this definition can be useful for certain purposes, but also can be forbidden for 
other purposes by adding appropriate constraints. 

We show the function for producing primitive one-edge code graphs: 

prim : (ℓ : EdgeLabel) → CodeGraph (EdgeLabel.inTypes ℓ) (EdgeLabel.outTypes ℓ)
prim ℓ = record
  { Inner  = Fin (EdgeLabel.outArity ℓ)
  ; output = λ i → (inj₂ i , refl)
  ; Edge   = ⊤
  ; eInfo  = λ _ → record { eLab    = ℓ
                           ; eInput  = λ i → (inj₁ i , refl)
                           ; eOutput = λ i → (inj₂ i , refl) } }

While type-checking the three propositional equality proofs refl in here, Agda actually proves that the 
mentioned types are indeed equal: An Agda program can only produce CodeGraph values that are cor- 
rectly typed, both on the external interface, and internally at each port of each edge. 
5 GS-Monoidal Categories

Corradini and Gadducci proposed gs-monoidal categories for modelling acyclic term graphs [CG99];
extended discussion of how code graphs fit into this framework is contained in [KAC06]. Here we only
present a quick summary, and tie this into the formalisation in Sect. 3.

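In a category theory context, we write "$f : \mathcal{A} \to \mathcal{B}$" to declare that morphism $f$ goes from object $\mathcal{A}$
to object $\mathcal{B}$, and use ";" as the associative binary composition operator; composition of two morphisms
$f : \mathcal{A} \to \mathcal{B}$ and $g : \mathcal{B}' \to \mathcal{C}$ is defined iff $\mathcal{B} = \mathcal{B}'$, and then $(f ; g) : \mathcal{A} \to \mathcal{C}$. Furthermore, the identity
morphism for object $\mathcal{A}$ is written $\mathbb{I}_{\mathcal{A}}$.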

Jungle can be seen to define morphisms of an untyped term graph category where objects are natural
numbers. (For CodeGraph, the collection of Objects is Σ [k : ℕ] (Vec Type k).)

In the Jungle category, a morphism from m to n is an element of Jungle m n, that is, a term graph with 
m input nodes and n output nodes. More precisely, such a morphism is an isomorphism class of jungles, 
since node and edge identities do not matter; we will define a Setoid where the Carrier is Jungle m n and 
equivalence proofs are Jungle isomorphisms. 

Composition F ; G "glues" together the output nodes of F with the respective input nodes of G, as we 
have implemented in seqJungle. The identity on n consists only of n input nodes which are also, in the 
same sequence, output nodes, and no edges, and is therefore constructed as a wiring graph: 

idJungle : {m : ℕ} → Jungle m m
idJungle = wire (allFin _)

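Definition 5.1 A symmetric strict monoidal category [ML71] consists of a category $C_0$, a strictly
associative monoidal bifunctor $\otimes$ with $\mathbb{1}$ as its strict unit, and a transformation $\mathbb{X}$ that associates with every
two objects $\mathcal{A}$ and $\mathcal{B}$ an arrow $\mathbb{X}_{\mathcal{A},\mathcal{B}} : \mathcal{A} \otimes \mathcal{B} \to \mathcal{B} \otimes \mathcal{A}$ with:

$$\mathbb{X}_{\mathcal{A} \otimes \mathcal{B},\mathcal{C}} = (\mathbb{I}_{\mathcal{A}} \otimes \mathbb{X}_{\mathcal{B},\mathcal{C}}) ; (\mathbb{X}_{\mathcal{A},\mathcal{C}} \otimes \mathbb{I}_{\mathcal{B}}) , \qquad \mathbb{X}_{\mathbb{1},\mathbb{1}} = \mathbb{I}_{\mathbb{1}} . \qquad \Box$$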

For Jungle, the unit object $\mathbb{1}$ is the natural number 0, and $\otimes$ on objects is addition. On morphisms,
$\otimes$ forms the disjoint union of code graphs, concatenating the input and output node sequences, as
implemented in parJungle. $\mathbb{X}_{m,n}$ differs from $\mathbb{I}_{m+n}$ only in the fact that the two parts of the output node
sequence are swapped:

exchJungle : (m n : ℕ) → Jungle (m + n) (n + m)
exchJungle m n = wire (Vec.map (raise m) (allFin n) ++ Vec.map (inject+ n) (allFin m))

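Definition 5.2 A strict gs-monoidal category is a symmetric strict monoidal category where in addition
$!$ associates with every object $\mathcal{A}$ of $C_0$ an arrow $!_{\mathcal{A}} : \mathcal{A} \to \mathbb{1}$, and $\nabla$ associates with every object $\mathcal{A}$ of
$C_0$ an arrow $\nabla_{\mathcal{A}} : \mathcal{A} \to \mathcal{A} \otimes \mathcal{A}$, such that $\mathbb{I}_{\mathbb{1}} = {!}_{\mathbb{1}} = \nabla_{\mathbb{1}}$, and the following axioms hold: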
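A sketch of these axioms in the notation above, assuming the usual presentation of gs-monoidal categories as in [CG99] rather than a verbatim copy of the original display: the duplicator is coassociative and cocommutative, the terminator is its counit, and both are compatible with the monoidal product.

```latex
\begin{align*}
\nabla_{\mathcal{A}} ; (\mathbb{I}_{\mathcal{A}} \otimes \nabla_{\mathcal{A}})
  &= \nabla_{\mathcal{A}} ; (\nabla_{\mathcal{A}} \otimes \mathbb{I}_{\mathcal{A}})
  && \text{(coassociativity)} \\
\nabla_{\mathcal{A}} ; \mathbb{X}_{\mathcal{A},\mathcal{A}}
  &= \nabla_{\mathcal{A}}
  && \text{(cocommutativity)} \\
\nabla_{\mathcal{A}} ; (\mathbb{I}_{\mathcal{A}} \otimes {!}_{\mathcal{A}})
  &= \mathbb{I}_{\mathcal{A}}
  && \text{(counit)} \\
{!}_{\mathcal{A} \otimes \mathcal{B}}
  &= {!}_{\mathcal{A}} \otimes {!}_{\mathcal{B}}
  && \text{(monoidality of } ! \text{)} \\
\nabla_{\mathcal{A} \otimes \mathcal{B}}
  &= (\nabla_{\mathcal{A}} \otimes \nabla_{\mathcal{B}}) ;
     (\mathbb{I}_{\mathcal{A}} \otimes \mathbb{X}_{\mathcal{A},\mathcal{B}} \otimes \mathbb{I}_{\mathcal{B}})
  && \text{(monoidality of } \nabla \text{)}
\end{align*}
```

In the counit law the strictness of the unit is used to identify $\mathcal{A} \otimes \mathbb{1}$ with $\mathcal{A}$.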

In Jungle, the "terminator" $!_n$ differs from $\mathbb{I}_n$ only in the fact that the output node sequence is empty.

termJungle : {n : ℕ} → Jungle n 0
termJungle = wire []
The "g" of "gs-monoidal" stands for "garbage": all edges of a term graph $G : m \to n$ are garbage in the
term graph $G ; {!}_n$.

The duplicator $\nabla_n$ in Jungle differs from $\mathbb{I}_n$ only in the fact that the output node sequence is the
concatenation of the input node sequence with itself:

dupJungle : {n : ℕ} → Jungle n (n + n)
dupJungle {n} = wire (allFin n ++ allFin n)

The "s" of "gs-monoidal" stands for "sharing": every input of $\nabla_k ; (F \otimes G)$ is shared by $F : k \to m$ and
$G : k \to n$.

Code graphs (and term graphs) over a fixed edge label set form a gs-monoidal category, but not a
Cartesian category, where in addition $!$ and $\nabla$ are natural transformations, i.e., for all $F : \mathcal{A} \to \mathcal{B}$ we
have $F ; {!}_{\mathcal{B}} = {!}_{\mathcal{A}}$ and $F ; \nabla_{\mathcal{B}} = \nabla_{\mathcal{A}} ; (F \otimes F)$. To see how these naturality conditions are violated by term
graphs, the following five Jungles correspond to the expressions below them (we draw jungles and code
graphs from the inputs on top to the outputs at the bottom, with numbered triangles marking input and
output positions, and rectangles enclosing edge labels).




Formalising (symmetric gs-) monoidal categories in Agda is a straight-forward extension of the standard
type-theoretic formalisation of category theory deriving essentially from Kanda's "effective categories"
[Kan81]: this uses setoids of morphisms, but not of objects. This approach is also used by Huet and
Saibi [HS98, HS00] for their formalisation of category theory in Coq, and by Gonzalía [Gon06] for his
formalisation of Freyd and Scedrov's allegory hierarchy [FS90] in Alf, a predecessor of Agda.

This approach also corresponds to the general practice in category theory to consider objects only up
to isomorphism, not up to equality. However, the definition of strict monoidal categories runs counter
to this approach, by assuming an object-level operation ($\otimes$) satisfying non-trivial object-level equations.
Therefore we directly formalise what MacLane calls "relaxed" monoidal categories, with natural
isomorphisms $\alpha : \mathcal{A} \otimes (\mathcal{B} \otimes \mathcal{C}) \to (\mathcal{A} \otimes \mathcal{B}) \otimes \mathcal{C}$ and $\lambda : \mathbb{1} \otimes \mathcal{A} \to \mathcal{A}$ and $\rho : \mathcal{A} \otimes \mathbb{1} \to \mathcal{A}$.

This explicit approach also has advantages for moving between different levels of data nesting
without requiring additional features; this is important for example for reasoning about the effect of SIMD
operations together with SIMD vector manipulations on individual scalar values, which is necessary for
verifying numerous high-performance "tricks", see e.g. [AK08].

6 Term Graphs as Let Constructs 

The code graph representation of Sect. 4 essentially is a typed variant of the current internal
representation of Coconut code graphs, but, as mentioned in the introduction, we essentially write Haskell
definitions to initially create code graphs.
In lazy functional programming implemented by graph reduction, since at least KRC [Tur82], local
definitions (via let or where) are understood to introduce sharing. In a mathematical context, [AK94]
represents cyclic term graphs as systems of mutually recursive equations, and [MOW98] presents sharing
in the call-by-need λ-calculus via let-expressions.

In the following, we present two formalisations of term graphs defined by non-recursive nested let- 
expressions. For the sake of readability, we restrict ourselves to untyped term graphs and single-output 
primitives. 

With let-expressions, we automatically have to deal with the complications of bound variables,
involving scoping, renaming to avoid variable clashes, etc. The Agda library NotSoFresh by Pouillard and
Pottier [PP10] allows us to abstract from these concerns to a large degree, at the cost of following the
discipline of their World-based programming interface. At the core of their approach, there are Worlds
in which different variables are in scope; for a world α, the set of usable names is Name α. Introducing
a new name happens via a "world extension link"; an element of α ↼ β is a weak link that provides a
variable in β that might be shadowing one of the variables in α, while an element of α ↼→ β is a strong
link that provides, in β, a variable that is fresh with respect to all variables in α.

For programming and in mathematics, we are used to working in a context of weak links, while
symbol manipulation systems, including theorem provers and compilers, frequently disambiguate names
so that they can work with strong links exclusively. To enable both settings, we will parameterise over
this "world extension relation" with a parameter E : World → World → Set.

We first present the type TG that formalises let-expressions with arbitrary nesting; this type is only
a slight modification of the λ-term datatype Tm from [PP10].

A value of type TG E α m n is, in the context of m input nodes and of a world α providing already
existing inner nodes, a term graph "suffix" producing n output nodes:

• The input node at position i can be produced as an output node by Input i.

• An existing node x : Name α is produced as an output node by V x.

• The empty suffix is called ε.

• Given two suffixes t and u of output lengths n₁ and n₂, their union, with concatenated output lists, is
t ▽ u. The symbol ▽ reads "fork", as in the fork algebras of [HFBV97]; it is related with the duplicator
$\nabla$ via the equation $t \triangledown u = \nabla_m ; (t \otimes u)$.

• A primitive f can only be invoked while applying it to the outputs of a term graph suffix t and while
at the same time creating a new node x in an expression of the shape Let x f t u, which, in more
conventional notation, would read "let x = f (t) in u".

If the primitive f expects k inputs, the argument term graph suffix t, which may not use the new name
x because it is in the "old" world α, has to have k outputs.

The term graph suffix u may use also the new name x, and its outputs will be the outputs of the
"Let x f t u" expression.

data TG (E : World → World → Set) (α : World) (m : ℕ) : ℕ → Set where
  Input : (i : Fin m) → TG E α m 1
  V     : (x : Name α) → TG E α m 1
  ε     : TG E α m 0
  _▽_   : {n₁ n₂ : ℕ} → TG E α m n₁ → TG E α m n₂ → TG E α m (n₁ + n₂)
  Let   : {β : World} {k n : ℕ}
        → (x : E α β)                         -- let x
        → (f : Label k) → (t : TG E α m k)    --   = f(t)
        → (u : TG E β m n)                    -- in u
        → TG E α m n
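To experiment with the shape of this datatype without installing NotSoFresh, the following sketch swaps the World-based name discipline for a plain counter: TG′ k m n has k inner nodes in scope, names are elements of Fin k, and Let extends the scope from k to suc k, with fzero referring to the most recently bound node. This de Bruijn-style replacement, the names TG′, edges, and TG0, and the module parameter Label are assumptions made for illustration; they are not the formalisation of the text.

```agda
open import Agda.Builtin.Nat      using (Nat; zero; suc; _+_)
open import Agda.Builtin.Equality using (_≡_; refl)

data Fin : Nat → Set where
  fzero : {n : Nat} → Fin (suc n)
  fsuc  : {n : Nat} → Fin n → Fin (suc n)

-- The arity-indexed label type is a module parameter.
module _ (Label : Nat → Set) where

  -- k: inner nodes in scope (stands in for the world), m: inputs, index: outputs.
  data TG′ (k m : Nat) : Nat → Set where
    Input : (i : Fin m) → TG′ k m 1
    V     : (x : Fin k) → TG′ k m 1
    ε     : TG′ k m 0
    _▽_   : {n₁ n₂ : Nat} → TG′ k m n₁ → TG′ k m n₂ → TG′ k m (n₁ + n₂)
    Let   : {a n : Nat}
          → (f : Label a) → (t : TG′ k m a)   -- let x = f(t)
          → (u : TG′ (suc k) m n)             -- in u, with x as fzero
          → TG′ k m n

  -- Number of Let-bound nodes, i.e. hyperedges, in a term graph suffix.
  edges : {k m n : Nat} → TG′ k m n → Nat
  edges (Input i)   = 0
  edges (V x)       = 0
  edges ε           = 0
  edges (t ▽ u)     = edges t + edges u
  edges (Let f t u) = suc (edges t + edges u)

  -- let x = H ((let y = F (i0) in (y ▽ y)) ▽ (let z = G (i1 ▽ i2) in z)) in x
  TG0 : Label 1 → Label 2 → Label 3 → TG′ 0 3 1
  TG0 F G H =
    Let H
      ( Let F (Input fzero) (V fzero ▽ V fzero)
      ▽ Let G (Input (fsuc fzero) ▽ Input (fsuc (fsuc fzero))) (V fzero) )
      (V fzero)

  -- The example contains three hyperedges; checked by normalisation.
  edges-TG0 : (F : Label 1) (G : Label 2) (H : Label 3) → edges (TG0 F G H) ≡ 3
  edges-TG0 F G H = refl
```

The counter-based version makes every link strong by construction, so it cannot express the weak-link (shadowing) setting that the parameter E is there to support; it is meant only to show how the output arities and the scope extension in Let interact.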
Without additional support, defining term graphs using this interface is somewhat inconvenient; the
following assumes a unary label F, a binary label G, and a ternary label H:

TG0 : Label 1 → Label 2 → Label 3 → TG _↼_ ø 3 1
TG0 F G H = let f0 = freshø                  -- a strong link
                x0 = FreshPack.weakOf f0     -- weak view of f0
                n0 = FreshPack.nameOf f0     -- Name of f0
  in Let x0 H
       (Let x0 F (Input zero) (V n0 ▽ V n0)
        ▽
        Let x0 G (Input (suc zero) ▽ Input (suc (suc zero))) (V n0)
       )
       (V n0)

Using slightly more conventional notation, this corresponds to the following, relatively readable version, 
with "i" prefixing inputs and "n" prefixing node names: 

    let n0 = H ( (let n0 = F (i0) in (n0 ∇ n0))
               ∇
                 (let n0 = G (i1 ∇ i2) in n0)
               ) in n0

Either by adding more notational support, or by defining a separate input language, this can provide an 
interface that comes reasonably close to Haskell-style programming. 

The real point of the definition of TG however is that it not only provides an input language, but also 
a representation of term graphs that can be manipulated and transformed by programs. For example, we 
can turn a TG with name shadowing (i.e., using weak links) into one with strong links by replacing all 
node names with fresh names relative to their respective worlds: 

    strengthenTG : {α α' : World} → Fresh α' → CEnv (Name α') α
                 → {m n : ℕ} → TG _↼_ α m n → TG _↼→_ α' m n
    strengthenTG _  _ ε         = ε
    strengthenTG fr Γ (t ∇ u)   = (strengthenTG fr Γ t) ∇ (strengthenTG fr Γ u)
    strengthenTG _  _ (Input i) = Input i
    strengthenTG fr Γ (V x)     = V (lookupCEnv Γ x)
    strengthenTG fr Γ (Let x f t u)
      = let Γ' = mapCEnv importWith Γ , x ↦ nameOf
        in Let strongOf f (strengthenTG fr Γ t) (strengthenTG nextOf Γ' u)
      where open FreshPack fr

Parallel composition is also easy to program, using fork after embedding, respectively shifting, the inputs: 

    parTG : {E : _} {α : _} {m₁ n₁ m₂ n₂ : ℕ}
          → TG E α m₁ n₁ → TG E α m₂ n₂ → TG E α (m₁ + m₂) (n₁ + n₂)
    parTG {E} {α} {m₁} {n₁} {m₂} {n₂} g₁ g₂ = extendTG m₂ g₁ ∇ shiftTG m₁ g₂

Sequential composition is much harder to implement directly, since the output nodes of the first argument 
may have been defined in separate worlds and combined with fork, and now need to be brought into a 
common world, which in general requires renaming and restructuring. A convenient "canonical form" 
for such let-expressions has no Let at argument positions, and no Let below fork, and therefore degenerates 
into a sequence of Let declarations, each binding a new node to the application of some primitive 
to existing nodes. When dealing with any kind of canonical form, especially in a dependently-typed 
setting, it is frequently worthwhile to declare it as a separate datatype, so that it becomes easier to 
exploit its properties. For this canonical form of TG, we introduce a separate datatype with additional 
restructuring below. 

7 Term Graphs with Sequential Node Declaration 

According to our explanation of TG term graphs, ∇ with ε obviously forms a monoid, but the monoid 
laws do not come for free in TG. Moving to the Vec container type instead provides us with the monoid 
laws in the standard library, and makes for a more canonical representation. With this change, and with 
strictly linearised node declaration, the term graph TG0 shown above could be written in a somewhat 
conventional notation as follows (without fully specifying the number of inputs): 

    let n0 = F i0
    let n1 = G i1 i2
    let n2 = H n0 n0 n1
    in [n2]

We introduce the type Arg for individual nodes, either existing inner nodes, or input positions, and a type 
synonym Args for their vectors: 

    data Arg α (m : ℕ) : Set where
      Input : (i : Fin m) → Arg α m
      V     : (x : Name α) → Arg α m

    Args α m n = Vec (Arg α m) n

The datatype TG' has the same reading as TG, but a simpler structure: 

• If all nodes have been declared, Output as assembles the vector of output nodes. 

• Let x f v u, which, in more conventional notation, would read "let x = f (v) in u", binds a new node x 
to an edge labelled f with input nodes v, and makes x visible in the remaining term graph suffix u. 

    data TG' E α (m : ℕ) : ℕ → Set where
      Output : {n : ℕ} → Args α m n → TG' E α m n
      Let    : {β : World} {k n : ℕ}
             → (x : E α β)                        -- let x
             → (f : Label k) (v : Args α m k)     --   = f (v)
             → (u : TG' E β m n)                  -- in u
             → TG' E α m n
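
To experiment with this canonical form without the NotSoFresh library, the following is a minimal sketch under a simplifying assumption that is not the representation used in this paper: a world is replaced by the number of nodes declared so far, and Fin α stands in for Name α, so the link argument x of Let disappears. Label is taken as a module parameter, and the names fz and fs are local renamings of the Fin constructors.

```agda
open import Data.Nat using (ℕ; suc)
open import Data.Fin using (Fin) renaming (zero to fz; suc to fs)
open import Data.Vec using (Vec; []; _∷_)

-- ASSUMPTION: worlds are natural numbers (count of declared nodes),
-- and a name in world α is an element of Fin α.
module _ (Label : ℕ → Set) where

  data Arg (α m : ℕ) : Set where
    Input : (i : Fin m) → Arg α m      -- input position
    V     : (x : Fin α) → Arg α m      -- already declared inner node

  Args : ℕ → ℕ → ℕ → Set
  Args α m n = Vec (Arg α m) n

  data TG' (α m : ℕ) : ℕ → Set where
    Output : {n : ℕ} → Args α m n → TG' α m n
    Let    : {k n : ℕ}
           → (f : Label k) (v : Args α m k)   -- let (node number α) = f (v)
           → (u : TG' (suc α) m n)            -- in u
           → TG' α m n

  -- let n0 = F i0 ; let n1 = G i1 i2 ; let n2 = H n0 n0 n1 in [n2]
  tg0 : Label 1 → Label 2 → Label 3 → TG' 0 3 1
  tg0 F G H =
    Let F (Input fz ∷ [])
      (Let G (Input (fs fz) ∷ Input (fs (fs fz)) ∷ [])
        (Let H (V fz ∷ V fz ∷ V (fs fz) ∷ [])
          (Output (V (fs (fs fz)) ∷ []))))
```

In this simplified variant node names are positions in declaration order, so name shadowing via weak links cannot be expressed; it only illustrates the shape of the datatype and the typing of argument vectors.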

We first show that primitive and wiring graphs are easily programmed: 

    prim : {k : ℕ} → Label k → TG' _↼→_ ø k 1
    prim {k} f = Let strongOf f (Vec.map Input (Vec.allFin k)) (Output [ V nameOf ])
      where open FreshPack freshø

    wire : {k n : ℕ} {E : _} {α : World} → Vec (Fin k) n → TG' E α k n
    wire v = Output (Vec.map Input v)

    idWire : {k : ℕ} {E : _} {α : World} → TG' E α k k
    idWire {k} = wire (Vec.allFin k)

    dup : {k : ℕ} {E : _} {α : World} → TG' E α k (k + k)
    dup {k} = wire (Vec.allFin k ++ Vec.allFin k)

    term : {k : ℕ} {E : _} {α : World} → TG' E α k 0
    term = wire []
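
The wiring graphs only rearrange input positions, so they can be tried out independently of the name-binding machinery. The following sketch assumes a simplified setting that is not the one used in this paper: worlds are natural numbers, Fin α replaces Name α, and Let carries no link argument. It uses only map, allFin, and _++_ from the standard library's Data.Vec.

```agda
open import Data.Nat using (ℕ; suc; _+_)
open import Data.Fin using (Fin) renaming (zero to fz)
open import Data.Vec using (Vec; []; _∷_; _++_; map; allFin)
open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- ASSUMPTION: simplified TG' with natural-number worlds.
module _ (Label : ℕ → Set) where

  data Arg (α m : ℕ) : Set where
    Input : (i : Fin m) → Arg α m
    V     : (x : Fin α) → Arg α m

  data TG' (α m : ℕ) : ℕ → Set where
    Output : {n : ℕ} → Vec (Arg α m) n → TG' α m n
    Let    : {k n : ℕ} → Label k → Vec (Arg α m) k
           → TG' (suc α) m n → TG' α m n

  -- A wiring graph outputs the listed input positions, declaring no nodes.
  wire : {α k n : ℕ} → Vec (Fin k) n → TG' α k n
  wire v = Output (map Input v)

  idWire : {α k : ℕ} → TG' α k k
  idWire {k = k} = wire (allFin k)

  dup : {α k : ℕ} → TG' α k (k + k)
  dup {k = k} = wire (allFin k ++ allFin k)

  term : {α k : ℕ} → TG' α k 0
  term = wire []

  -- Duplicating a single input yields two copies of input position 0.
  dup₁ : dup {α = 0} {k = 1} ≡ Output (Input fz ∷ Input fz ∷ [])
  dup₁ = refl
```

The final definition is checked by computation alone, which illustrates that wiring graphs in this form are just vectors of input positions.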

With these definitions, we can reconstruct the term graph TG0 from above via the gs-monoidal interface, 
with sequential composition seqTG' and parallel composition parTG' defined below: 

    tg0 = seqTG' (parTG' (seqTG' (prim F) dup) (prim G)) (prim H)

For the analogous function to strengthenTG, which replaces each link x in a Let construct with a fresh 
link, we present an easy generalisation to serve dual purposes: 

• Starting from weak links, strengthenTG' {_↼_} id is proper strengthening; 

• starting from strong links, strengthenTG' {_↼→_} StrongPack.weakOf is renaming with fresh 
names with respect to the new world α'. 

    strengthenTG' : {E : _} → (E ⇒ _↼_)
                  → {α α' : World} → Fresh α' → CEnv (Name α') α
                  → {m n : ℕ} → TG' E α m n → TG' _↼→_ α' m n
    strengthenTG' weak fr Γ (Output as) = Output (mapVarArgs (lookupCEnv Γ) as)
    strengthenTG' weak fr Γ (Let x f as u)
      = let Γ' = mapCEnv importWith Γ , weak x ↦ nameOf
        in Let strongOf f (mapVarArgs (lookupCEnv Γ) as) (strengthenTG' weak nextOf Γ' u)
      where open FreshPack fr

Both sequential and parallel composition are implemented by inserting the material of one graph between 
the innermost Let and the Output of the other graph. We define a general helper function for this purpose: 

    inLet' : {α β : World} → (s : α ⋆↼→ β) → Fresh β → {m n n' : ℕ}
           → ({γ : World} → (s' : α ⋆↼→ γ) → Fresh γ
               → Args γ m n → TG' _↼→_ γ m n')
           → TG' _↼→_ β m n → TG' _↼→_ β m n'
    inLet' s fr F (Let x f t u) = Let x f t (inLet' (s ▻ x) fr' F u) where fr' = StrongPack.nextOf x
    inLet' s fr F (Output as)   = F s fr as

We first implement fork, which walks the only primitively available fresh link freshø past all the Lets 
of g₁, uses the resulting fresh link fr to rename g₂, and afterwards adapts the output list as₁ of g₁ to the 
inner world of the renamed g₂, so that the two output lists can be concatenated: 

    forkTG' : {m n₁ n₂ : ℕ}
            → TG' _↼→_ ø m n₁
            → TG' _↼→_ ø m n₂
            → TG' _↼→_ ø m (n₁ + n₂)
    forkTG' {m} {n₁} {n₂} g₁ g₂ = inLet' ε freshø
      (λ {γ} s' fr as₁ → inLet' ε fr
          (λ s'' _ as₂ → Output (mapVarArgs (import⊆ (⋆↼→⊆ s'')) as₁ ++ as₂))
          (strengthenTG' {_↼→_} StrongPack.weakOf fr emptyCEnv g₂)
      ) g₁

The implementation of parallel composition then relies on fork in the same way as that for TG: 

    parTG' : {m₁ n₁ : ℕ} → TG' _↼→_ ø m₁ n₁
           → {m₂ n₂ : ℕ} → TG' _↼→_ ø m₂ n₂
           → TG' _↼→_ ø (m₁ + m₂) (n₁ + n₂)
    parTG' {m₁} g₁ {m₂} g₂ = forkTG' (extendTG' m₂ g₁) (shiftTG' m₁ g₂)
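
The helper functions extendTG' and shiftTG' are used above but not shown in this section. The following is a sketch of what they might look like, assuming a simplified setting that is not the one used in this paper (natural-number worlds, Fin α for Name α, no link argument in Let). The names embed, shift, onArg, and mapInputs are hypothetical and introduced only for this illustration; the argument order of extendTG' and shiftTG' is inferred from their use in parTG'.

```agda
open import Data.Nat using (ℕ; zero; suc; _+_)
open import Data.Fin using (Fin; zero; suc)
open import Data.Vec using (Vec; map)

-- ASSUMPTION: simplified TG' with natural-number worlds.
module _ (Label : ℕ → Set) where

  data Arg (α m : ℕ) : Set where
    Input : (i : Fin m) → Arg α m
    V     : (x : Fin α) → Arg α m

  data TG' (α m : ℕ) : ℕ → Set where
    Output : {n : ℕ} → Vec (Arg α m) n → TG' α m n
    Let    : {k n : ℕ} → Label k → Vec (Arg α m) k
           → TG' (suc α) m n → TG' α m n

  -- Keep position i, but among m + m' inputs (m' new inputs on the right).
  embed : {m : ℕ} (m' : ℕ) → Fin m → Fin (m + m')
  embed m' zero    = zero
  embed m' (suc i) = suc (embed m' i)

  -- Move position i past m' new inputs on the left.
  shift : {m : ℕ} (m' : ℕ) → Fin m → Fin (m' + m)
  shift zero     i = i
  shift (suc m') i = suc (shift m' i)

  -- Rename input positions in a single argument; node names are untouched.
  onArg : {α m m' : ℕ} → (Fin m → Fin m') → Arg α m → Arg α m'
  onArg h (Input i) = Input (h i)
  onArg h (V x)     = V x

  -- Rename input positions in all argument vectors and in the output vector.
  mapInputs : {α m m' n : ℕ} → (Fin m → Fin m') → TG' α m n → TG' α m' n
  mapInputs h (Output as) = Output (map (onArg h) as)
  mapInputs h (Let f v u) = Let f (map (onArg h) v) (mapInputs h u)

  extendTG' : {α m n : ℕ} (m' : ℕ) → TG' α m n → TG' α (m + m') n
  extendTG' m' = mapInputs (embed m')

  shiftTG' : {α m n : ℕ} (m' : ℕ) → TG' α m n → TG' α (m' + m) n
  shiftTG' m' = mapInputs (shift m')
```

With these types, extendTG' m₂ g₁ and shiftTG' m₁ g₂ both have m₁ + m₂ inputs, which is what fork requires of its two arguments.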

Sequential composition follows the same pattern as forkTG', and first traverses the declarations of g₁, 
which are preserved, but uses the helper function mapArgsTG' to properly replace any occurrence of 
inputs in argument and output lists of the renamed g₂ with the corresponding output nodes of g₁, after 
adapting them to the respective nested world. 

    seqTG' : {k m n : ℕ}
           → TG' _↼→_ ø k m
           → TG' _↼→_ ø m n
           → TG' _↼→_ ø k n
    seqTG' g₁ g₂ = inLet' ε freshø
      (λ {γ} s' fr as₁ → mapArgsTG' ε
          (λ s'' as → seqArgs (mapVarArgs (import⊆ (⋆↼→⊆ s'')) as₁) as)
          (strengthenTG' {_↼→_} StrongPack.weakOf fr emptyCEnv g₂)
      ) g₁

Finally, it is also reasonably easy to convert a TG' term graph into a Jungle with Fin k as Inner node set 
and as Edge set, where k is the number of Let declarations. 
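
The number k used for the node and edge sets of such a Jungle is determined by a simple traversal. The following sketch computes it, again assuming a simplified TG' with natural-number worlds that is not the representation used in this paper; the name letCount is hypothetical.

```agda
open import Data.Nat using (ℕ; zero; suc)
open import Data.Fin using (Fin)
open import Data.Vec using (Vec)

-- ASSUMPTION: simplified TG' with natural-number worlds.
module _ (Label : ℕ → Set) where

  data Arg (α m : ℕ) : Set where
    Input : (i : Fin m) → Arg α m
    V     : (x : Fin α) → Arg α m

  data TG' (α m : ℕ) : ℕ → Set where
    Output : {n : ℕ} → Vec (Arg α m) n → TG' α m n
    Let    : {k n : ℕ} → Label k → Vec (Arg α m) k
           → TG' (suc α) m n → TG' α m n

  -- Number of Let declarations: one inner node and one edge per Let.
  letCount : {α m n : ℕ} → TG' α m n → ℕ
  letCount (Output _)  = zero
  letCount (Let _ _ u) = suc (letCount u)
```

Since every Let introduces exactly one node together with the edge that produces it, the same Fin (letCount g) can index both sets.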

8 Conclusion and Outlook 

Formalising mathematical definitions of term graphs and their operations in Agda is a remarkably 
straightforward exercise, and, due to the dependent typing of Agda, also carries over to typed term graphs much 
more easily than in the more restricted type systems of Haskell or higher-order logic. 

The remarkable abstract interface to variable binding provided by Pouillard and Pottier's NotSoFresh 
Agda library [PP10] also makes name-binding representations of term graphs conveniently accessible 
to mechanised reasoning and programmed manipulation. Typing is easily added to our TG and TG' 
datatypes; the original Tm datatype provided as a NotSoFresh example includes typing, but we omitted 
it here to improve readability. 

Implementing additional term graph operations, manipulations, and conversion functions, and proving 
the algebraic properties of the term graph operations is ongoing work. 

Future work will strive to base code-graph based optimised-code generation algorithms for the Coconut 
project [AK09a] on our Agda formalisations of code graphs, with a fully verifying tool chain as 
ultimate goal. 